Constrained episodic reinforcement learning in concave-convex and knapsack settings

Neural Information Processing Systems

We propose an algorithm for tabular episodic reinforcement learning with constraints. We provide a modular analysis with strong theoretical guarantees for settings with concave rewards and convex constraints, and for settings with hard constraints (knapsacks). Most of the previous work in constrained reinforcement learning is limited to linear constraints, and the remaining work focuses on either the feasibility question or settings with a single episode. Our experiments demonstrate that the proposed algorithm significantly outperforms these approaches in existing constrained episodic environments.


Review for NeurIPS paper: Constrained episodic reinforcement learning in concave-convex and knapsack settings

Weaknesses: My major concerns: 1. Line 248 suggests that linear programming could be used in ConPlanner, but the experiments instead test different unconstrained RL planners under a Lagrangian heuristic. I think the paper should have compared the results of different constrained problem solvers. While the theoretical proofs are plentiful, the paper does not provide any empirical support, which makes the method less intuitive. Although the paper claims to compare the proposed framework with other concave-convex approaches, the problems used in the experiments do not seem to be concave-convex. Grid-world problems such as the Mars rover used in the paper have linear constraints rather than convex ones.
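To make the distinction between an exact constrained solver and a Lagrangian heuristic concrete, here is a minimal sketch on a hypothetical toy problem. It is not the paper's ConPlanner or its experimental setup: the one-state MDP, the names `plan` and `lagrangian_heuristic`, and all numbers are assumptions chosen purely for illustration. The sketch folds a per-step cost into the reward with a multiplier `lam`, plans by backward induction, and bisects on `lam` until the expected cost fits the budget.

```python
# Hypothetical toy problem: every name and number below is an illustrative
# assumption, not taken from the paper under review.
H = 3                                   # episode length
STATES = [0]                            # a single state, for simplicity
START = 0
ACTIONS = ["safe", "risky"]
REWARD = {"safe": 1.0, "risky": 2.0}    # per-step reward of each action
COST = {"safe": 0.0, "risky": 1.0}      # per-step resource consumption


def transitions(state, action):
    """Return a list of (probability, next_state); here a deterministic self-loop."""
    return [(1.0, 0)]


def plan(lam):
    """Backward induction on the scalarized reward r - lam * c.

    Returns the greedy policy together with its expected total reward and
    expected total cost from START.
    """
    v = {s: 0.0 for s in STATES}    # value of the scalarized objective
    vr = {s: 0.0 for s in STATES}   # expected reward of the greedy policy
    vc = {s: 0.0 for s in STATES}   # expected cost of the greedy policy
    policy = []
    for _ in range(H):
        new_v, new_vr, new_vc, pi_h = {}, {}, {}, {}
        for s in STATES:
            best = None
            for a in ACTIONS:
                nxt = transitions(s, a)
                q = REWARD[a] - lam * COST[a] + sum(p * v[s2] for p, s2 in nxt)
                if best is None or q > best[0]:
                    qr = REWARD[a] + sum(p * vr[s2] for p, s2 in nxt)
                    qc = COST[a] + sum(p * vc[s2] for p, s2 in nxt)
                    best = (q, qr, qc, a)
            new_v[s], new_vr[s], new_vc[s], pi_h[s] = best
        v, vr, vc = new_v, new_vr, new_vc
        policy.insert(0, pi_h)      # steps are processed last to first
    return policy, vr[START], vc[START]


def lagrangian_heuristic(budget, lam_hi=10.0, iters=30):
    """Bisect on lam to find a small multiplier whose greedy policy fits the budget."""
    lam_lo = 0.0
    policy, reward, cost = plan(lam_lo)
    if cost <= budget:
        return lam_lo, policy, reward, cost
    best = None
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        policy, reward, cost = plan(lam)
        if cost <= budget:
            best = (lam, policy, reward, cost)
            lam_hi = lam
        else:
            lam_lo = lam
    return best


if __name__ == "__main__":
    print(plan(0.0)[1:])                   # unconstrained planner
    print(lagrangian_heuristic(1.0)[1:])   # budget of one unit of cost
```

In this toy, the heuristic with a budget of 1 ends up playing "safe" at every step (reward 3, cost 0), whereas playing "risky" exactly once would also respect the budget and earn reward 4. A deterministic planner on the scalarized reward cannot find that policy here, which is one reason a comparison against an exact constrained solver such as a linear program could be informative.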


Review for NeurIPS paper: Constrained episodic reinforcement learning in concave-convex and knapsack settings

While it is true that constraints can typically be made part of the normal optimisation process in RL by encapsulating them in the reward function, it is often much easier to specify constraints directly, which is the setting this paper considers. The reviewers were positive about the motivation and execution of this paper, and were all in favour of accepting it. I would suggest motivating this setting, at least briefly, in the abstract, to help interested readers find and appreciate this paper more easily.

